Skip to content

Mixed-precision parallel linear solver#7048

Open
nrseman wants to merge 13 commits into
OPM:masterfrom
haugenlabs:mixed-istl
Open

Mixed-precision parallel linear solver#7048
nrseman wants to merge 13 commits into
OPM:masterfrom
haugenlabs:mixed-istl

Conversation

@nrseman
Copy link
Copy Markdown

@nrseman nrseman commented May 12, 2026

This PR leverages OPM's ISTL framework to provide a parallel implementation of the mixed-precision linear solver. For more information refer to opm/simulators/linalg/mixed/README.md

@GitPaean GitPaean added the manual:irrelevant This PR is a minor fix and should not appear in the manual label May 12, 2026
@atgeirr atgeirr added manual:new-feature This is a new feature and should be described in the manual and removed manual:irrelevant This PR is a minor fix and should not appear in the manual labels May 13, 2026
Copy link
Copy Markdown
Member

@SoilRos SoilRos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, it looks good, I just left some minor concerns. I still need to make the performance tests though.

* @param A Pointer to bsr matrix.
* @param x Pointer to input vector.
* @param y Pointer to output vector.
*/
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand the implementation correctly, the blocks must be already transposed. If that is the case, that should be written in this part of the documentation. I think the same holds for the other matrix vector product.


vA[0] += _mm256_cvtps_pd(_mm_loadu_ps(AA+0))*_mm256_permute4x64_pd(vx,0b00000000); //0b01010101
vA[1] += _mm256_cvtps_pd(_mm_loadu_ps(AA+3))*_mm256_permute4x64_pd(vx,0b01010101); //0b01010101
vA[2] += _mm256_cvtps_pd(_mm_loadu_ps(AA+6))*_mm256_permute4x64_pd(vx,0b10101010); //0b01010101
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are those comments for? Maybe just comment above those that this is the transposed block matrix vector product.

{
cols[icol++] = row->getindexptr()[i];
}
rows[irow+1] = rows[irow]+row->getsize();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the new matrix, it will better to do this loop with the iterators and their index() method instad of the internal member functions of the row:

for(auto col = row.begin(); col != row.end(); ++col)
{
    cols[icol++] = col.index();
}
rows[irow+1] = icol;

//! @tparam Vector the block-vector used by linear operator
//! @tparam b block size
template <class Vector, int b>
class MatrixWrapper
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add this to a namespace?
Also, from the point of view of the user of this class, it is irrelevant that it wraps the c implementation or not. What is important is that its storage of 3x3 blocks is downcasted to single precision and vectorized. So I would call it something else, like MixedMatrixWrapper for example.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

manual:new-feature This is a new feature and should be described in the manual

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants